Model Assessment Tools for a Model False World
A standard goal of model evaluation and selection is to find a model that
approximates the truth well while at the same time is as parsimonious as
possible. In this paper we emphasize the point of view that the models under
consideration are almost always false, if viewed realistically, and so we
should analyze model adequacy from that point of view. We investigate this
issue in large samples by looking at a model credibility index, which is
designed to serve as a one-number summary measure of model adequacy. We define
the index to be the maximum sample size at which samples from the model and
those from the true data generating mechanism are nearly indistinguishable. We
use standard notions from hypothesis testing to make this definition precise.
We use data subsampling to estimate the index. We show that the definition
leads us to some new ways of viewing models as flawed but useful. The concept
is an extension of the work of Davies [Statist. Neerlandica 49 (1995)
185--245].
Comment: Published at http://dx.doi.org/10.1214/09-STS302 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Building and using semiparametric tolerance regions for parametric multinomial models
We introduce a semiparametric ``tubular neighborhood'' of a parametric model
in the multinomial setting. It consists of all multinomial distributions lying
in a distance-based neighborhood of the parametric model of interest. Fitting
such a tubular model allows one to use a parametric model while treating it as
an approximation to the true distribution. In this paper, the Kullback--Leibler
distance is used to build the tubular region. Based on this idea one can define
the distance between the true multinomial distribution and the parametric model
to be the index of fit. The paper develops a likelihood ratio test procedure
for testing the magnitude of the index. A semiparametric bootstrap method is
implemented to better approximate the distribution of the LRT statistic. The
approximation permits more accurate construction of a lower confidence limit
for the model fitting index.
Comment: Published at http://dx.doi.org/10.1214/08-AOS603 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
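As a rough illustration of the index-of-fit idea in the abstract above, the sketch below computes the smallest Kullback--Leibler distance from a given multinomial distribution to a parametric family. Everything specific here is an assumption made for the example and is not taken from the paper: the parametric model is a binomial family, the direction of the divergence is KL(p || model), the minimization is a plain grid search, and the names `kl_divergence`, `binomial_pmf`, and `index_of_fit` are hypothetical.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def binomial_pmf(n, theta):
    """Cell probabilities of a Binomial(n, theta) model (assumed parametric family)."""
    return [math.comb(n, k) * theta**k * (1 - theta) ** (n - k) for k in range(n + 1)]

def index_of_fit(p, grid_size=999):
    """Smallest KL(p || model) over a grid of theta values, with the minimizing theta.

    Grid search is an illustrative choice, not the paper's procedure.
    """
    n = len(p) - 1
    candidates = []
    for i in range(1, grid_size + 1):
        theta = i / (grid_size + 1)  # interior grid point in (0, 1)
        candidates.append((kl_divergence(p, binomial_pmf(n, theta)), theta))
    return min(candidates)

# A distribution inside the model has distance zero; one outside does not.
inside = binomial_pmf(2, 0.5)
outside = [0.4, 0.2, 0.4]
dist_inside, theta_inside = index_of_fit(inside)
dist_outside, theta_outside = index_of_fit(outside)
```

A tubular neighborhood of radius c would then contain every multinomial distribution whose distance computed this way is at most c. The likelihood ratio test and bootstrap described in the abstract are not sketched here.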
The topography of multivariate normal mixtures
Multivariate normal mixtures provide a flexible method of fitting
high-dimensional data. It is shown that their topography, in the sense of their
key features as a density, can be analyzed rigorously in lower dimensions by
use of a ridgeline manifold that contains all critical points, as well as the
ridges of the density. A plot of the elevations on the ridgeline shows the key
features of the mixed density. In addition, by use of the ridgeline, we uncover
a function that determines the number of modes of the mixed density when there
are two components being mixed. A followup analysis then gives a curvature
function that can be used to prove a set of modality theorems.
Comment: Published at http://dx.doi.org/10.1214/009053605000000417 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Estimating the number of classes
Estimating the unknown number of classes in a population has numerous
important applications. In a Poisson mixture model, the problem is reduced to
estimating the odds that a class is undetected in a sample. The discontinuity
of the odds prevents the existence of locally unbiased and informative
estimators and restricts confidence intervals to be one-sided. Confidence
intervals for the number of classes are also necessarily one-sided. A sequence
of lower bounds to the odds is developed and used to define pseudo maximum
likelihood estimators for the number of classes.
Comment: Published at http://dx.doi.org/10.1214/009053606000001280 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Comparison of satellite based cloud retrieval methods for cirrus and stratocumulus
One difficulty in using satellite remote sensing data is the spatial variability of cloud properties on scales smaller than most meteorological satellite fields of view (approx. 4 to 8 km). The variation of satellite-derived cloud cover with satellite sensor spatial resolution is examined for seven cloud cover retrieval methods: (1) reflectance threshold; (2) temperature threshold; (3) ISCCP; (4) HBTM (Hybrid Bispectral Threshold Method); (5) NCLE; (6) spatial coherence; and (7) functional box counting. The first two methods are simple single-spectral thresholds, which classify a satellite pixel as cloud filled if the measured reflectance is greater than the threshold or if the measured equivalent blackbody temperature is less than the threshold. The next three methods are bispectral, using one visible-wavelength window channel and one thermal-infrared window channel. The final two algorithms rely on the spatial variability within the cloud field to determine cloud cover. Spatial coherence assumes only that the cloud field occurs in a single layer and that the clouds are optically thick in the infrared window. LANDSAT Thematic Mapper (TM) data are used to test the spatial resolution dependence of the cloud algorithms. The ISCCP bispectral threshold applied to the full resolution data is used as the reference or truth cloud cover, after which the retrieval methods are applied to the spatial resolutions. Studies of the fraction of pixels in the scene at cloud edge, and of the profile of reflectance and temperature near cloud edges, indicate an uncertainty in the reference cloud fraction of 1 to 5 percent
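The two single-channel threshold rules described above, and their sensitivity to sensor resolution, can be sketched in a few lines. This is a toy illustration only: the 4x4 reflectance scene, the threshold of 0.3, the block-averaging model of a coarser sensor, and the function names are all assumptions made for the example and do not come from the study.

```python
def block_average(grid, factor):
    """Degrade resolution by averaging non-overlapping factor x factor blocks."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [grid[r + i][c + j] for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out

def cloud_fraction_reflectance(reflectance, threshold):
    """Fraction of pixels flagged cloudy: reflectance greater than the threshold."""
    pixels = [v for row in reflectance for v in row]
    return sum(1 for v in pixels if v > threshold) / len(pixels)

def cloud_fraction_temperature(temperature, threshold):
    """Fraction of pixels flagged cloudy: blackbody temperature less than the threshold."""
    pixels = [v for row in temperature for v in row]
    return sum(1 for v in pixels if v < threshold) / len(pixels)

# Hypothetical scene: one isolated bright pixel plus a 2x2 cloud.
scene = [
    [0.8, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.9],
    [0.1, 0.1, 0.9, 0.9],
]
THRESHOLD = 0.3  # assumed value, for illustration only

full_res = cloud_fraction_reflectance(scene, THRESHOLD)
coarse = cloud_fraction_reflectance(block_average(scene, 2), THRESHOLD)
```

In this toy scene the isolated bright pixel is averaged away at the coarser resolution, so the retrieved cloud fraction drops from 5/16 to 1/4. That is the kind of resolution dependence the abstract describes, shown here for the reflectance threshold only.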
Disconnected Loop Noise Methods in Lattice QCD
A comparison of the noise variance between algorithms for calculating
disconnected loop signals in lattice QCD is carried out. The methods considered
are the Z(N) noise method and the Volume method. We find that the noise
variance is strongly influenced by the Dirac structure of the operator.
Comment: espcrc.sty file needed. Talk presented at Lattice '97, Edinburgh,
Scotlan